Abstract


Echolocating bats produce very diverse vocal signals for echolocation and social communi- 
cation that span an impressive frequency range of 1 to 120 kHz or 7 octaves. This tremen- 
dous vocal range is unparalleled in mammalian sound production and thought to be 
produced by specialized laryngeal vocal membranes on top of vocal folds. However, their 
function in vocal production remains untested. By filming vocal membranes in excised bat 
larynges (Myotis daubentonii) in vitro with ultra-high-speed video (up to 250,000 fps) and
using deep learning networks to extract their motion, we provide the first direct observations 
that vocal membranes exhibit flow-induced self-sustained vibrations to produce 10 to 95 
kHz echolocation and social communication calls in bats. The vocal membranes achieve 
the highest fundamental frequencies (fo’s) of any mammal, but their vocal range of 3 to
4 octaves is comparable to that of most mammals. We evaluate the currently outstanding hypotheses
for vocal membrane function and propose that most laryngeal adaptations in echolocating 
bats result from selection for producing high-frequency, rapid echolocation calls to catch 
fast-moving prey. Furthermore, we show that bats extend their lower vocal range by recruit- 
ing their ventricular folds—as in death metal growls—that vibrate at distinctly lower frequen- 
cies of 1 to 5 kHz for producing agonistic social calls. The different selection pressures for 
echolocation and social communication facilitated the evolution of separate laryngeal struc- 
tures that together vastly expanded the vocal range in bats. 


Introduction 


The evolution of powered flight, echolocation, and subsequent fast buzzing allows bats to hunt 
and capture fast-moving airborne prey and thereby exploit the riches of the night: flying 
insects [1,2]. To detect small prey, biosonar signals need to contain high frequencies to provide 
efficient acoustic reflection and high bandwidth to provide high localization accuracy and spatial
resolution [3]. Echolocation thus selects for increased fundamental frequency, fo, and
expansion of the fo range, and many species of bats (FM bats) produce precisely timed,
frequency-modulated echolocation calls that sweep in fo from as high as 125 kHz down to
approximately 10 kHz in calls of only 1 to 2 ms duration [3-5]. Some species have calls with fo
up to 250 kHz [6], and, as such, bats produce the highest known voiced fo of all mammals. In
addition, many bat species produce social communication calls [7] of which some extend their 
fo range further down to 1 kHz [8,9]. Thus, bats can produce very diverse signals that span an 
impressive fo range of 1 to 120 kHz or 6 to 7 octaves, while humans and other mammals typically
only produce 3 octaves and in exceptional cases 4 to 5 [10]. The tremendous vocal range
of bats is unparalleled in mammalian sound production, but how bats achieve this remains 
unknown. 

Bat calls are produced laryngeally as in most mammals [11,12], but in bats, the vocal folds 
exhibit several adaptations compared to the generalized mammalian vocal fold, likely associ- 
ated with echolocation requirements [13-16]. First, the paired vocal folds end in 6 to 10 
micrometer thin, apical vocal membranes [15]. Such membranes have been reported in bats, 
cats, and nonhuman primates and have been suggested to act as low-mass oscillators that can 
vibrate almost independently of the vocal fold proper, and thereby support the production of 
high-frequency vocalizations [17-19]. In marmosets, the first direct observation of vocal mem- 
brane vibration showed that they can indeed vibrate at frequencies up to 9 kHz [19]. However, 
in bats, we lack direct observation of vocal membrane vibration and their function in vocal 
production remains untested. Second, a smaller apical membrane, the ventricular membrane, points downwards
from the ventricular folds [15]. In general, the ventricular folds
have received very little attention in comparative bioacoustics, and while their altered geome- 
try in bats suggests function, this remains untested. 

A combination of different laryngeal structures may serve to facilitate the tremendous fo
range and different call types bats produce. Indeed, other mammals, such as marmosets, can
switch from vocal fold to vocal membrane vibration over postnatal development [19]. In 
humans, ventricular folds play a role in several low-frequency forms of singing, such as death 
metal grunting and Tuvan throat singing, where they can touch the vocal fold and increase the 
mass of the oscillating structures [20]. This results in a much lower fo than can be achieved by
the vocal folds alone [20,21]. Additionally, human vocal folds can exhibit different oscillation 
regimes in the different voice registers, such as vocal fry, chest, and falsetto that expand the 
vocal range [22,23]. In an excised bat larynx preparation, abrupt changes in acoustics were 
attributed to such register jumps [11]. However, no direct evidence exists as to whether and how
laryngeal and ventricular structures can vibrate to produce sound in bats due to challenges of 
imaging the vocal folds in vivo at the extremely high speeds required. 

Here, we test the hypothesis that specialization of different laryngeal structures supports the 
extreme frequency range of FM bats. We test this hypothesis in Daubenton’s bats (Myotis daubentonii)
that have an extreme fo range of approximately 7 octaves, from 1 to 95 kHz [9,24].


Results 


While all echolocating bats are assumed to have apical vocal membranes on their vocal folds, this
has only been established in fewer than 10 out of approximately 1,100 species [15,16]. We confirmed the pres-
ence of vocal membranes in M. daubentonii by extracting the fresh larynx from 5 individuals. 
Visual inspection through the narrow epiglottic opening showed the presence of vocal and 
ventricular folds separated by the laryngeal ventricle, aka the ventricle of Morgagni, in all individuals
(Fig 1). We confirmed the presence of thin apical vocal membranes extending cranially
on the vocal fold and caudally on the ventricular fold. The overall laryngeal anatomy includes
several unique adaptations compared to a generalized mammalian larynx as described for
Eptesicus fuscus [15,16,25]: (1) a hypertrophied laryngeal musculature, particularly the cricothyroid
muscle; (2) a large cricothyroid membrane; and (3) calcified cricoid and thyroid
cartilages. 
Fig 1. Specialized anatomy of the microchiropteran bat larynx. (a) Sagittal cross-section of the larynx of M. daubentonii. The paired vocal and ventricular folds with
apical membranes are suspended between arytenoids and thyroid. (b) The mechanism of sound production is hypothesized to be airflow-induced vibration of the vocal 
membranes (left). Contraction of the cricothyroid muscle rotates the thyroid around the cricoarytenoid joint (right) and thereby elongates and increases tension in the 
vocal folds and membranes (+), which in turn increases the fo. (c) Detail of left ventricular and vocal folds with thin apical membranes. AR, arytenoid; CR, cricoid;
THY, thyroid cartilage. 


https://doi.org/10.1371/journal.pbio.3001881.g001 


We next mounted these larynges in an excised larynx setup (Fig 2A, see Materials and 
methods) [19,26-28]. After approximation of the ventricular folds with micromanipulators, 
increasing the bronchial pressure induced self-sustained vibration of the ventricular folds (Fig 
2B and 2C and S1 Movie) in 3 out of 3 individuals. We could not see past the ventricular folds 
and could thus not observe if the vocal folds and vocal membranes also vibrated. The glottal 
opening was darker than the ventricular folds and the glottal opening dynamics could reliably 
be extracted with a simple threshold method (Fig 2B). The fo of these oscillations lay on the
identity line of the sound and vibration fo (Fig 2F), strongly suggesting this vibration caused
the sound pressure signal. The fo of ventricular fold vibrations ranged between 1 and 3 kHz.

To allow visual access to the vocal folds and vocal membranes, we removed the ventricular 
folds by carefully cutting through the ventricle of Morgagni. After approximating the vocal 
folds, increasing the bronchial pressure induced self-sustained oscillations of the vocal mem- 
branes in 4 out of 4 individuals. To ensure accurate capture of the fast motion of the vocal mem- 
branes, we filmed their motion at framerates up to 250,000 fps. We never observed vibration of 
the vocal folds, only of the vocal membranes. Simple threshold detection did not reliably extract 
the moving tips of the transparent vocal membranes that passed over the underlying vocal fold; 
therefore, we trained a neural network for posture analysis (DeepLabCut, see Materials and 
methods) that reliably detected the vocal membrane edge along the glottal opening in millions 
of frames (S2 Movie). The fo of these oscillations increased linearly with the fo of the produced
sound (Fig 2F and S1 Table) at a range of 10 to 20 kHz. 
Fig 2. Ventricular folds and vocal membranes produce sound in different frequency ranges. (a) Sketch of excised larynx setup (see Materials and
methods). (b) Video still of ventricular fold vibration filmed at 25,000 fps with annotated glottal boundary (red) using gray threshold. (c) Glottovibrogram of
ventricular fold vibration (left) and detail (right) color coded for glottal width. (d) Video still of vocal membrane filmed at 125,000 fps. The edge of the vocal
membrane was detected using the deep learning network (DeepLabCut). Colored traces depict movement of 10 locations along the vocal membranes that were
detected along the vocal membrane tip. (e) Glottovibrogram of vocal membrane vibration (left) and detail (right). (f) The fo of vibration equals that of produced
sound (S1 Table) showing that the structures generate the sound, but in distinct frequency ranges. Colors indicate individuals; dotted gray line is the identity
line. The data underlying c, e, and f can be found in S1, S2, and S3 Data files.


https://doi.org/10.1371/journal.pbio.3001881.g002


An important characteristic of myoelastic-aerodynamic systems [29] is the minimal pres- 
sure, aka phonation threshold pressure (PTP), needed to induce a behavior state change of the 
dynamical system from steady state to oscillating limit cycle. A lower PTP means a more effi- 
cient energy conversion from air flow to acoustic pressure [30]. To accurately measure the 
PTP in excised larynx or syrinx experiments, a slow increase of bronchial pressure is typically 
applied [31]. Indeed, we could determine the PTP of the ventricular folds during slow 1 kPa/s 
increases to be 3.99 ± 0.87 kPa (N = 4).

However, for vocal membrane vibration, the onset requirements were strikingly different. 
Interestingly, we could not consistently induce self-sustained oscillation of the vocal mem- 
branes at slow pressure ramps (Fig 3A). Only when we drove the larynges with faster pressure 
patterns that more closely resembled in vivo pressure pulses measured in E. fuscus [14] did we 
succeed in consistent induction of vocal membrane oscillation and thereby sound (Fig 3B). Of 
the 5 specimens to which we applied both slow and fast pressure ramps, all 5 showed vocal membrane
oscillation onset during fast pressure ramps, but only 1 during slow ramps. The PTP was 3.23 ± 1.41
kPa (N = 5) at a pressure rate of change of 130.8 ± 85.0 kPa/s. Thus, for bat vocal membranes,
two pressure requirements need to be met for the system to bifurcate to stable limit cycle oscil- 
lation: a minimal pressure and a high pressure rate of change. 

At first approximation, the distinct fo ranges produced in vitro by the vocal membranes (10 to 20 kHz) and
ventricular folds (1 to 5 kHz) correspond
to the fo ranges of the distinct call types used by most Vespertilionids: echolocation versus
social communication calls. However, echolocation calls in M. daubentonii can extend much
higher with fo’s up to 95 kHz [24]. Fundamental frequency control in bats, and mammals in
general, is mostly achieved by contracting the cricothyroid (CT) muscle (Fig 1B) [2,11,14]. We mimicked
Fig 3. Oscillation onset of vocal membranes requires fast pressure modulation compared to ventricular folds. (a) Sound
spectrograms during slow 1 kPa/s bronchial pressure show that vocal membranes did not oscillate during slow ramps (in 3 
out of 4 individuals). (b) Driven by fast pressure modulation, vocal membranes reliably vibrated. Red vertical dashed lines 
show the onset for PTP detection. The data underlying a and b can be found in S4 and S5 Data files. 


https://doi.org/10.1371/journal.pbio.3001881.g003


contraction of the CT muscle by rotating the cricoid cartilage caudally (see Materials and methods),
thereby lengthening and increasing tension in both vocal folds and vocal membranes (Fig
1B). Indeed, this rotation led to an upward extension of the fo range to 70 kHz. Thus, in vitro
the vocal membranes oscillated from 10 to 70 kHz, which overlaps well with the fo range in
vivo of both echolocation and several types of social calls of M. daubentonii (Fig 4A). 

Next, we recorded low-frequency agonistic calls of 3 individuals of M. daubentonii. These 
very short (<2 ms) calls are often described as broadband, noisy sounds [8], but they have a 
harmonic structure with an fo distribution of 1 to 5 kHz (Fig 4B, see Materials and
methods). Thus, the fo range of in vivo agonistic social calls overlaps with the in vitro vibration
of the ventricular folds (Fig 4B), which strongly suggests that these structures are responsible 
for the generation of low-frequency agonistic calls. 


Discussion 


By filming the bat larynx in vitro with ultra-high-speed video up to 250,000 fps and using deep 
learning networks to extract vocal membrane motion, we provide the first direct observations 
that vocal membranes exhibit flow-induced self-sustained vibrations to produce echolocation 
calls in Daubenton’s bats. Furthermore, we show that both vocal membranes and ventricular
folds vibrate to produce sound, at distinctly different frequency ranges. The vocal membranes
generate high frequencies of 10 to 70 kHz in the echolocation and social call range, while
the ventricular folds produce low frequencies of 1 to 5 kHz in the range of agonistic social calls.

Mammalian vocal membranes have been hypothesized to serve 3 specific purposes [18] that 
we can now test experimentally on bats. Firstly, vocal membranes supposedly increase fo by
Fig 4. The vocal range of laryngeal structures in vitro corresponds to frequency ranges of distinct social calls in Daubenton’s bat. (a) Vocal membrane fo
range in vitro (blue vertical bar) compared to reported in vivo range for social [8] and echolocation calls [24] of M. daubentonii. (b) Ventricular fold fo range in
vitro (green vertical bar) corresponds well to fo range of agonistic social calls of M. daubentonii. Boxplot whiskers indicate range. For values see S2 Table. Inset
shows a spectrogram and oscillogram of an agonistic social call. Abbreviations as in Fig 1. The data underlying a and b can be found in S6 Data. 


https://doi.org/10.1371/journal.pbio.3001881.g004 


uncoupling the vocal membrane vibration from vocal fold vibration. Our data confirm high-frequency
vocal membrane vibration, and the in vitro fo range without CT modulation (10 to
20 kHz) corresponds well to the in vivo range of 8 to 20 kHz after bilateral ablation of the superior
laryngeal nerve in E. fuscus [14]. In contrast to vocal membranes in marmosets [19], we
observed that bat vocal membranes vibrated completely uncoupled from the vocal folds and 
did not observe any vocal fold motion at all. Second, vocal membranes can supposedly reduce 
the PTP and thereby increase vocal efficiency. Our experimental data contradict these model-based
suggestions. The vocal membranes had a PTP of 3.22 ± 1.41 kPa in vitro, which compares
well to the in vivo PTP of 2.5 to 4.0 kPa in E. fuscus [14]. This species is twice the weight of M.
daubentonii and thus its PTP may deviate from that of M. daubentonii. However, when comparing
across mammals, such PTP values are, if anything, on the high side and certainly not lower. 
The unsteady aerodynamic conditions required to initiate vocal membrane vibration are fasci- 
nating. Low Reynolds number airfoils show peaks in drag and lift coefficients due to rapid 
acceleration of relative airspeed [32], which are preceded by the maximum acceleration points 
[33] in a manner that mirrors the pressure speed profiles preceding the vocal membrane vibra- 
tion onset in this study. Although the flow conditions are different and our observations are 
preliminary, they emphasize the need for further investigation of the role of unsteady aerodynamic
effects in bat vocalizations. Thirdly, the vocal membranes supposedly support the production
of broadband chaotic signals via increased oscillatory coupling [18]. Our data do
not support this hypothesis in bats either. We did not observe mechanical coupling between 
vocal folds and vocal membranes, and although we did not quantify this specifically, we did 
not observe deterministic chaotic signals. 

The role of the peculiar ventricular apical membranes remains unclear. The ventricular and 
vocal membranes form a drumhead with a narrow slit over the ventricle of Morgagni [15]. This
configuration raises the hypothesis that the ventricle of Morgagni acts as a cavity that
generates a shallow cavity whistle [26] for echolocation calls. However, our data clearly show
that removing the ventricular folds and membranes—and thereby the ventricle—results in 
high-frequency sounds by vocal membrane oscillation. Therefore, they were not essential for 
sound production, but this does not exclude that they play a role. Perhaps, the ventricular 
membranes are coupled to vocal membrane oscillation during echolocation calls. Unfortu- 
nately, we could not directly observe the ventricular membranes in our experiments as they 
were either obscured by the ventricular folds or removed (together with the ventricular
folds) when observing the vocal membranes and folds. Direct observations in vitro could
involve a hemilarynx experiment, where the larynx is halved and closed by a glass plate 
through which the oscillations can be observed [34]. 

Anatomical adaptations in the bat larynx, such as the ossified cricoid and thyroid in combi- 
nation with hypertrophied muscles, are purportedly adaptations to high pressures in the larynx 
during sound production [25]. However, an acoustic pressure of maximally 200 Pa (= 140 dB
re. 20 µPa) [35] and maximal bronchial air pressures of 8 kPa [14] do not exert much stress on
bony structures with tensile strengths in the MPa range [36]. Instead, we propose that the ossi- 
fication results from a strong selection on these structures to reduce weight while maintaining 
structural strength. The superfast CT muscles can power the rapid motion needed during feed- 
ing buzzes, but their speed trades off with force [2,37,38]. As a result, superfast muscles are 
exceptionally weak [37] and produce over 50 times lower tetanic stresses compared to normal 
skeletal muscles [39]. Muscular hypertrophy can partially compensate for the low area-specific 
force of superfast vocal muscles in bats [2] as it increases the cross-section area and thus the 
total force. 
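
The pressure comparison in this paragraph can be checked with a short calculation. The Python sketch below converts an acoustic pressure to a sound pressure level re. 20 µPa and compares the reported maximal bronchial pressure with a tensile strength of 1 MPa. The 1 MPa value is only an assumed, illustrative lower bound for "the MPa range", and all function and variable names are hypothetical.

```python
import math

P_REF_PA = 20e-6  # reference sound pressure in air: 20 micropascal


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB re. 20 uPa for a pressure given in Pa."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / P_REF_PA)


# Acoustic pressure quoted in the text: 200 Pa, i.e. about 140 dB.
acoustic_spl = spl_db(200.0)

# Assumed illustrative tensile strength (not a measured value).
ASSUMED_TENSILE_STRENGTH_PA = 1e6
BRONCHIAL_PRESSURE_PA = 8e3  # maximal bronchial pressure quoted in the text

# Fraction of the assumed strength that the bronchial pressure represents.
load_fraction = BRONCHIAL_PRESSURE_PA / ASSUMED_TENSILE_STRENGTH_PA
```

Under this assumption the bronchial pressure amounts to less than 1% of the material strength, which is the order-of-magnitude argument made above.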

Taken together, we propose the evolutionary scenario that many laryngeal morphological 
adaptations in echolocating bats are the result of selection for producing (1) high-frequency 
and (2) rapid echolocation calls to catch fast-moving prey. This scenario would be concurrently
followed by a complementary specialization of the auditory system that affords bats sensitive
hearing at high frequencies and over a wide frequency range [40]. First, a strong selection to 
increase spatial resolution [3] led to an increase in fo by reducing the mass of the vibrating
vocal membranes. Second, a strong selection to increase call repetition rate led to very low 
muscle force [2]. The reduced force was compensated by higher cross-sectional area (CSA), 
i.e., a hypertrophied muscle, and the actuated mass was reduced to require less force: The
vocal folds reduced in mass and both thyroid and cricoid reduced in size and became ossified 
to withstand large bending moments during acceleration. Lastly, the reduced thyroid was 
replaced by the cricothyroid membrane to have a flexible, airtight trachea. Taken together, 
these adaptations allowed the production of ultrasonic calls with fast FM that could be 
repeated above 200 Hz for catching erratic airborne prey in the dark. 

The vocal membranes achieve unparalleled high voiced fo in bats. However, the vocal range
of vocal membrane-produced echolocation calls, at 10 to 95 kHz in Daubenton’s bat, is only 3
to 4 octaves and thereby comparable to other mammals [10]. When considering only vocal
membrane-produced sounds, we expect the vocal range for all bats to fit within 3 to 4 octaves.
As a consequence, we do not expect the material properties of the vocal membranes to be sig- 
nificantly different. However, because smaller strains in muscles allow faster motion, an 
increased stiffness would require a smaller range of motion to achieve the same vocal range 
[10]. Therefore, a stiffer vocal membrane would allow faster FM and call repetition rates at the 
same frequency bandwidth, but this remains to be tested. 
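
The octave figures quoted in this paragraph follow directly from the frequency limits, since one octave corresponds to a doubling in frequency. A minimal Python sketch of that arithmetic is given below; the function and variable names are illustrative only.

```python
import math


def octaves(f_low_khz: float, f_high_khz: float) -> float:
    """Number of octaves spanned between two frequencies (log2 of their ratio)."""
    if f_low_khz <= 0 or f_high_khz <= 0:
        raise ValueError("frequencies must be positive")
    return math.log2(f_high_khz / f_low_khz)


# Frequency limits taken from the text (kHz).
vocal_membrane_octaves = octaves(10, 95)  # vocal membrane calls
ventricular_fold_octaves = octaves(1, 5)  # ventricular fold calls
combined_octaves = octaves(1, 95)         # both structures together
```

The vocal membrane range alone falls between 3 and 4 octaves, whereas the combined range of both structures falls between 6 and 7 octaves.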

There are only a limited number of known ways for mammals to lower fo. First, vocal folds can exhibit different
vibratory patterns, aka registers, due to differential posturing by laryngeal muscles
[22,23]. In humans, the lowest register is the vocal fry register. The excised horseshoe bat lar- 
ynx produced distinctly different frequencies that were suggested to be different registers [11], 
but no laryngeal dynamics were measured to confirm this. In contrast, our data suggest that in
FM bats, echolocation calls and agonistic social calls are not caused by different vocal membrane
registers, but by using different laryngeal structures. The mechanism by which ventricular
folds decrease fo in other mammals is by coupled oscillation to vocal folds, as in tigers
[41], grunting pigs [21], human throat singing [42], and metal growling [43]. In our prepara- 
tion, we did not see vocal fold vibration in any condition and were not able to observe vocal 
folds during ventricular fold oscillation. As such, we cannot conclude that the lower fo is
the result of mechanical coupling between laryngeal structures. However, because we could 
not get the vocal folds to oscillate, we venture to speculate that in bats, the ventricular folds 
have taken on the role of lower frequency vibrations. 

An additional effect of high fo is highly directional sound emission, i.e., sound pressure
attenuates rapidly at angles away from the main broadcast axis. This has substantial benefits 
for navigation through echolocation [44], but likely becomes disadvantageous for social com- 
munication as the sender generally wishes to broadcast as broadly as possible depending on 
the context [45]. Thus, there likely is a strong opposing evolutionary drive for echolocation 
calls versus social calls. Echolocation favors high frequencies for spatial resolution and high 
directionality, while communication favors low frequencies for low directionality and low 
atmospheric attenuation. This duality may then have facilitated the evolution of separate vocal 
sub-structures with distinctly different sound producing purposes in bats. Likewise, fruit bats 
of the genus Rousettus echolocate by tongue clicks and communicate via laryngeal sounds 
[46], indicating a similar duality between echolocation and social call production. Together, 
the different mechanisms vastly expand the vocal range in bats and provide a rich substrate for 
vocal communication. 


Materials and methods 
Subjects 


We used the larynges of 8 adult specimens of M. daubentonii in total (6 males, 2 females). Ani- 
mals were caught under license 2020-9239 from the Ministry of Environment. Animals were 
housed in bat keeping facilities at 11L:13D photoperiod at approximately 22°C and 60% rela- 
tive humidity. All experiments were conducted at the University of Southern Denmark and 
were in accordance with the Danish Animal Experiments Inspectorate (Copenhagen, 
Denmark). 


Larynx dissection and preparation 


All animals were euthanized with isoflurane (Baxter laboratories). The trachea, larynx, and 
surrounding tissue were dissected in ice-cold oxygenated buffer (150 mM NaCl, 2.5 mM KCl, 
4 mM CaCl2, 1 mM NaH2PO4, 1 mM MgSO4, 10 mM HEPES, 12 mM Glucose, pH 7.4
adjusted with a 1 M Trizma solution). Five specimens (MD10, MD11, MD21, MD22, and 
MD23) were flash-frozen in liquid nitrogen and stored at -80°C. Two specimens (MD13 and
MD14) were used fresh in the setup described below. For 1 specimen (MD12), the larynx was
transferred to a sylgard-covered petri dish on ice for inspection under a stereomicroscope 
(M165-FC, Leica Microsystems). This specimen was then also flash-frozen in liquid nitrogen 
and stored at -80°C. Later, this specimen was thawed and fixed in 4% PFA on a roller for
cross-sections. 

Before an experiment, we thawed the tissue in a refrigerator and then submerged it in 
refrigerated Ringer’s solution in a dish on ice and removed additional tissue surrounding the
larynx and trachea. We then mounted the larynx on a rounded, blunted 21G needle (Sterican, 
0.8 x 40 mm). The larynx was slid over the blunt needle until the caudal edge of the cricoid 
touched the tube exit and was secured with a 10-0 monofilament suture (AroSurgical Instruments,
California, United States of America) around the trachea.


Experimental setup 


We mounted the larynges in the excised larynx setup described previously [26,27]. The setup 
allows for running humidified air through the larynx at precisely controlled pressures (model 
PCD, Alicat Scientific) while controlling the configuration of the larynx with micromanipula- 
tors and recording any sound produced. For recording the sound, we used a 1/4-inch pressure 
microphone-preamplifier assembly (model 46BD, frequency response ±1 dB from 10 Hz to 25 kHz
and ±2 dB from 4 Hz to 70 kHz, G.R.A.S., Denmark). The positions of the larynx and microphone
were fixed relative to each other during an experiment, with the microphone placed horizontally at 22 to 44 mm
from the larynx. The microphone signal was amplified (12AQ, G.R.A.S., Denmark) and calibrated
before each experiment (Calibrator 42AB, G.R.A.S., Denmark). The sound, pressure,
and flow signals were low pass filtered at 100, 10, and 10 kHz, respectively (filter model EF502 
low pass filter DC-100 kHz and EF120 low pass filter DC-10 kHz, Thorlabs, USA), and digitized
at 250 kHz (USB 6259, 16 bit, National Instruments, Austin, Texas, USA).

To capture the laryngeal configuration during the experiments, we used a Leica DC425 
camera mounted on the stereomicroscope, controlled using LAS (Leica Application Suite Ver- 
sion 4.7.0, Leica Microsystems, Switzerland). To record tissue vibration, we used a high-speed 
camera (FASTCAM SA1.1, Photron, Tokyo, Japan) filming at 10,000 to 20,000 fps for ventric- 
ular folds and 100,000 to 250,000 fps for vocal membranes, controlled by Photron FASTCAM 
Viewer 4. For illumination, we used a Leica GLS150 lamp through a liquid light guide con- 
nected to the stereomicroscope (static images) or a Thorlabs plasma light source (HPLS200 
Series) (high-speed imaging). All control and analysis software was written in MATLAB
(MathWorks). 


Excised larynx phonation protocol 


We removed the epiglottis to give an unobstructed view of the ventricular folds and make 
adduction of the arytenoids easier. To induce ventricular fold vibration, we applied a linear 
increase in bronchial pressure from 0 to 6 kPa at a speed of 1 kPa/s. We wanted to minimize 
the amount of air flowing over the delicate laryngeal structures to prevent them from drying 
out. Because the PTP values were rather high, we did not always start at 0 kPa, but sometimes 
at 3 kPa. Ventricular fold vibration was induced in 4 larynges (MD10, MD11, MD13, and 
MD23). We then turned on the plasma light source and repeated this ramp while triggering 
the camera when the pressure was passing the PTP. In 3 of these (MD11, MD13, and MD23), 
we successfully filmed their vibration. 

To expose the vocal membranes, we carefully cut in a horizontal plane between the ventricu- 
lar and vocal membranes with adventitia scissors (S&T surgical instruments, Switzerland) 
through the ventricle of Morgagni. To induce their vibration, we applied a slow pressure ramp 
from 0 to 7 kPa at 1 kPa/s. This type of pressure function only yielded oscillation for 1 out of the 
first 4 individuals, and we did not apply it for the last 2 to minimize experimental time. Next, 
we applied a sequence of four 300-ms fast pressure modulations between 0 and 4 kPa.
This readily resulted in oscillation in 5 specimens (MD10, MD11, MD13, MD14, and MD23). 
Because we needed to film at rates up to 250,000 fps, we only had a short buffer available and
sometimes needed several runs to trigger the camera during vocal membrane vibration with 
correct lighting conditions. We successfully filmed vocal membrane oscillation in 4 animals. 

To increase the fo of the vocal membrane vibrations, we mimicked cricothyroid muscle contraction.
We applied 5 to 7 kPa pressure for 1.5 seconds and manually rotated the thyroid
downward to increase the tension of the vocal fold and membrane in 5 individuals (MD11,
MD13, MD14, MD22, and MD23). Since the YIN algorithm tends to fail for fo’s above one quarter
of the sampling rate [47], we instead extracted them using the time-frequency ridge detection
function in MATLAB (tfridge) on spectrograms of the sound signal (nfft = 2,048, overlap = 50%,
Hamming window) [26].
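
To illustrate the reasoning in this paragraph, the Python sketch below computes the quarter-of-sampling-rate limit for the 250 kHz acquisition rate and shows a deliberately simplified ridge extraction that picks the strongest frequency bin in each spectrogram frame. This per-frame maximum is a stand-in for illustration and is not the MATLAB tfridge function used in the study; the function names and the toy spectrogram are assumptions.

```python
def quarter_rate_limit_hz(fs_hz: float) -> float:
    """Frequency above which the text reports the pitch tracker tends to fail."""
    return fs_hz / 4.0


def argmax_ridge(spectrogram, freqs_hz):
    """Return, for each time frame, the frequency of the strongest bin.

    spectrogram[t][k] is the magnitude at time frame t and frequency bin k;
    freqs_hz[k] is the centre frequency of bin k.
    """
    ridge = []
    for frame in spectrogram:
        strongest_bin = max(range(len(frame)), key=frame.__getitem__)
        ridge.append(freqs_hz[strongest_bin])
    return ridge


limit_hz = quarter_rate_limit_hz(250_000)  # 62,500 Hz

# Toy spectrogram: two time frames, three frequency bins (hypothetical values).
toy_freqs_hz = [10_000.0, 40_000.0, 70_000.0]
toy_spectrogram = [[0.1, 0.9, 0.2], [0.1, 0.2, 0.8]]
toy_ridge = argmax_ridge(toy_spectrogram, toy_freqs_hz)
```

With a 250 kHz sampling rate the limit lies at 62.5 kHz, below the highest frequencies reached when mimicking muscle contraction, which is consistent with the need for a spectrogram-based method.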


Glottovibrogram construction 


Each video was rotated to make the glottal midline vertical and cropped around the glottis. We 
then calculated the opening of the vocal folds as a function of anterior-posterior position (AP) 
and time, i.e., the glottovibrogram (GVG), by automated detection of the glottis shape per
image. For the ventricular folds, the glottis was defined as all pixels below a manually set 
threshold gray value. The resulting logical image was horizontally and vertically dilated with a 
2-pixel line (imdilate function in MATLAB) and filled (imfill), which resulted in an outline of 
the glottis. The glottis width was the sum of the vertical opening pixels scaled for magnifica- 
tion. To determine the position of the vocal membrane edges, we could not use a simple image 
grayscale threshold, because the vocal membranes were too translucent, and the trailing edge 
was crossing the underlying vocal folds with nearly the same pixel values. This led to erroneous 
detection of the thin vocal membrane parts as glottis. Instead, we trained a deep learning 
model to detect the vocal membrane edges using the deep learning Python package DeepLab- 
Cut (2.2b) [48,49]. We digitally superimposed 8 to 10 equidistantly spaced dashed horizontal 
lines on the videos and trained the network on detecting where the vocal membrane edges 
crossed these lines. The superimposed lines were used to fix the detections vertically as we 
were only interested in the horizontal movement of the vocal membranes. After training for 1 
million iterations, the videos were analyzed, resulting in pixel coordinates for points along the 
glottal edge for each analyzed frame. 
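
The threshold-based glottis detection described above for the ventricular folds can be sketched in Python as follows. This is a hedged illustration using scipy.ndimage in place of the MATLAB functions imdilate and imfill. The function name glottis_width, the mm_per_pixel parameter, and the synthetic frame are assumptions. The sketch also assumes that the AP axis runs along image rows after rotation, so the opening at each AP position is the number of glottis pixels in that row.

```python
import numpy as np
from scipy import ndimage

def glottis_width(frame, threshold, mm_per_pixel=1.0):
    """Return the glottal opening per image row (assumed AP position)."""
    mask = frame < threshold                                  # dark pixels = glottis
    # Dilate with a 2-pixel line, first horizontally, then vertically.
    mask = ndimage.binary_dilation(mask, structure=np.ones((1, 2), dtype=bool))
    mask = ndimage.binary_dilation(mask, structure=np.ones((2, 1), dtype=bool))
    mask = ndimage.binary_fill_holes(mask)                    # close holes in the outline
    return mask.sum(axis=1) * mm_per_pixel                    # opening per row, scaled

# Synthetic frame: bright tissue (200) with a dark slit (10), placeholder data.
frame = np.full((20, 20), 200, dtype=np.uint8)
frame[5:15, 8:12] = 10
width = glottis_width(frame, threshold=50)

# One column per frame gives a glottovibrogram (AP position x time).
gvg = np.stack([glottis_width(f, 50) for f in [frame, frame]], axis=1)
```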

To calculate the fo, we first determined the anterior-posterior (AP) location where the mean
opening was maximal. Then, we extracted the opening at this location along the AP axis from
the GVG. We resampled all other physiological signals (pressure, sound) to the framerate of
the video (resample function in MATLAB). The fo of the sound and glottal opening signal was
determined using the yin algorithm [47], combining signal power and aperiodicity criteria to
extract fo per 10 frames.
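
The selection of the AP location and the resampling step can be sketched in Python as below. This is an illustrative sketch rather than the analysis code: scipy.signal.resample_poly stands in for the MATLAB resample function, the function names and example values are assumptions, and the yin-based fo estimation itself is not reproduced here.

```python
from fractions import Fraction

import numpy as np
from scipy.signal import resample_poly

def opening_at_widest_position(gvg):
    """gvg: 2-D array, rows = AP positions, columns = video frames."""
    row = int(np.argmax(gvg.mean(axis=1)))   # AP position with largest mean opening
    return row, gvg[row, :]

def resample_to_framerate(signal, fs_signal, frame_rate):
    """Polyphase resampling of a signal to the video frame rate."""
    ratio = Fraction(int(frame_rate), int(fs_signal))
    return resample_poly(signal, ratio.numerator, ratio.denominator)

# Placeholder GVG with 3 AP positions and 4 frames.
gvg = np.array([[0.0, 1.0, 0.0, 1.0],
                [0.0, 3.0, 0.0, 3.0],
                [0.0, 2.0, 0.0, 2.0]])
row, opening = opening_at_widest_position(gvg)

# Placeholder pressure trace at 250 kHz, resampled to a 125,000 fps video.
pressure = np.linspace(0.0, 1.0, 1000)
pressure_at_frames = resample_to_framerate(pressure, 250_000, 125_000)
```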


Signal analysis 


To determine PTP and S_PTP, we first low-pass filtered the pressure signal at 500 Hz with a
sixth-order Butterworth filter (butter and filtfilt functions in MATLAB) to remove
high-frequency fluctuations. The rate of pressure change was then calculated by taking the
pressure difference between consecutive time steps (diff function in MATLAB) and multiplying
it by the acquisition rate (250 kHz) to obtain the pressure speed (pressure change per
second). We defined PTP and S_PTP as the pressure and pressure speed at the time where
the sound power crossed 0.2 mPa.
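
The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the original MATLAB analysis: the filter is built in second-order sections and applied with sosfiltfilt, a zero-phase equivalent of butter plus filtfilt chosen here for numerical stability at such a low relative cutoff. The function name, argument names, and the synthetic ramp are hypothetical, and the threshold is assumed to be in the same units as the sound power trace.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def ptp_and_speed(pressure, sound_power, fs=250_000,
                  cutoff_hz=500.0, power_threshold=0.2):
    """Return (PTP, S_PTP): pressure and its rate of change at sound onset."""
    sos = butter(6, cutoff_hz, btype="low", fs=fs, output="sos")
    smooth = sosfiltfilt(sos, pressure)          # zero-phase low-pass filter
    speed = np.diff(smooth) * fs                 # pressure change per second
    above = np.flatnonzero(sound_power > power_threshold)
    if above.size == 0:
        raise ValueError("sound power never crosses the threshold")
    i = min(int(above[0]), speed.size - 1)       # first sample above threshold
    return smooth[i], speed[i]

# Synthetic example: 10 kPa/s ramp with sound onset at 0.1 s (placeholder data).
fs = 250_000
t = np.arange(int(0.2 * fs)) / fs
pressure = 10.0 * t                              # kPa
sound_power = np.where(t >= 0.1, 1.0, 0.0)
ptp, s_ptp = ptp_and_speed(pressure, sound_power, fs=fs)
```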


In vivo social call recordings 


Because we could not find detailed quantification of the low-frequency calls of M. daubentonii
in the literature, we recorded 9 additional males in Odense, Denmark, caught under license
2021-1194. Daubenton’s bats do not spontaneously produce low-frequency calls as easily as,
e.g., Pipistrellus pygmaeus, and only 3 individuals produced such calls when (1) they were
joined with others into 1 enclosure after daily weighing; or (2) they were stroked while
roosting in the large flight cage at SDU. We recorded calls with an Olympus LS-100 24-bit
recorder at a sampling rate of 96 kHz and a GRAS 40BF 1/4-inch microphone connected to an
Avisoft 16-bit USG at 375 kHz. We selected small segments that included calls and extracted
the fo of the sound with the yin algorithm [47].


Statistics 


All values listed are mean ± SD. The correlation between the fo of sound and vocal fold
vibrations was established with linear regression (regress function) in MATLAB (MathWorks). The
boxplots were constructed using the MATLAB toolbox IoSR (v.2.8, Institute of Sound Record- 
ing, University of Surrey, 2016), with no limit for outliers, meaning horizontal lines indicate 
minimum, maximum, median, and interquartile range. 
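
For readers working outside MATLAB, the regression between vibration fo and sound fo can be sketched in Python with scipy.stats.linregress. This is an assumed equivalent of the regress call, not the code used here, and the data below are synthetic placeholders rather than measurements from this study.

```python
import numpy as np
from scipy.stats import linregress

def fo_regression(fo_vibration, fo_sound):
    """Return slope, intercept, and R^2 of sound fo against vibration fo."""
    fit = linregress(fo_vibration, fo_sound)
    return fit.slope, fit.intercept, fit.rvalue ** 2

# Synthetic placeholder values in kHz; identical series give a slope of 1.
vib = np.linspace(10.0, 95.0, 20)
snd = vib.copy()
slope, intercept, r2 = fo_regression(vib, snd)
```

A slope near 1 with an intercept near 0 would indicate that the sound fo matches the vibration frequency of the tissue.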


Supporting information 


S1 Table. Descriptive statistics of fo regressions of vocal membrane vibration versus sound in
Fig 2F.
(DOCX) 


S2 Table. Sound fo ranges for different call types and laryngeal performance in vitro in
Myotis daubentonii. 
(DOCX) 


S3 Table. Phonation threshold pressures (PTP) and pressure speed at PTP (S_PTP) in vitro.
(DOCX) 


S1 Movie. Ventricular fold oscillation during sound production. Individual MD13; filmed 
at 20,000 frames per second. 
(MP4) 


S2 Movie. Vocal membrane oscillation during sound production. The edges of the vocal
membranes are detected at 10 locations with a neural network. Individual MD13; filmed at
125,000 frames per second.

(MP4) 


S1 Data. The data underlying Fig 2C, glottovibrogram of ventricular fold vibration. File
contains data points for time and AP position axis, as well as the width of the opening in pixels 
and millimeters at the corresponding AP positions. 

(MAT) 


S2 Data. The data underlying Fig 2E, glottovibrogram of vocal membrane vibration. File
contains data points for time and AP position axis, as well as the width of the opening in pixels 
and millimeters at the corresponding AP positions. 

(MAT) 


S3 Data. The data underlying Fig 2F, fo of vibration and resulting sound for ventricular
folds and vocal membranes. File contains data points for fo of vibration (x) and sound (y).
(MAT) 


S4 Data. The data underlying Fig 3A, sound spectrogram during slow 1 kPa/s bronchial
pressure ramp. File contains data points for the spectrogram frequency, power, and time. 
(MAT) 


S5 Data. The data underlying Fig 3B, sound spectrogram during fast bronchial pressure
modulation. File contains data points for the spectrogram frequency, power, and time, as well
as the onset times. Stored as MATLAB data file (.mat).
(MAT) 


S6 Data. The data underlying Fig 4A and 4B, vocal membrane fo range in vitro, ventricular
fold fo range in vitro, and fo range of agonistic social calls of M. daubentonii. File contains
extracted fo's.

(MAT) 